Isla Vista
Data-driven Learning of Interaction Laws in Multispecies Particle Systems with Gaussian Processes: Convergence Theory and Applications
Feng, Jinchao, Kulick, Charles, Tang, Sui
We develop a Gaussian process framework for learning interaction kernels in multi-species interacting particle systems from trajectory data. Such systems provide a canonical setting for multiscale modeling, where simple microscopic interaction rules generate complex macroscopic behaviors. While our earlier work established a Gaussian process approach and convergence theory for single-species systems, and was later extended to second-order models with alignment and energy-type interactions, the multi-species setting introduces new challenges: heterogeneous populations interact both within and across species, the number of unknown kernels grows, and asymmetric interactions such as predator-prey dynamics must be accommodated. We formulate the learning problem in a nonparametric Bayesian setting and establish rigorous statistical guarantees. Our analysis shows recoverability of the interaction kernels, provides quantitative error bounds, and proves statistical optimality of posterior estimators, thereby unifying and generalizing previous single-species theory. Numerical experiments confirm the theoretical predictions and demonstrate the effectiveness of the proposed approach, highlighting its advantages over existing kernel-based methods. This work contributes a complete statistical framework for data-driven inference of interaction laws in multi-species systems, advancing the broader multiscale modeling program of connecting microscopic particle dynamics with emergent macroscopic behavior.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > United States > California > Santa Barbara County > Isla Vista (0.04)
- North America > United States > California > San Diego County > Vista (0.04)
- (3 more...)
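As a rough illustration of the kind of system the entry above describes, the sketch below evaluates the right-hand side of a first-order, two-species model in one dimension, with a separate kernel for each ordered pair of species so that predator-on-prey and prey-on-predator interactions can differ. The model form, the function name `velocities`, and the kernels in `KERNELS` are illustrative assumptions and are not taken from the paper.

```python
# Assumed model form (for illustration only):
#   dx_i/dt = sum_{j != i} (1 / N_{k_j}) * phi[(k_i, k_j)](|x_j - x_i|) * (x_j - x_i)
# where k_i is the species of agent i and N_k is the number of agents of species k.

def velocities(x, species, kernels):
    """Return dx_i/dt for every agent, given positions, species labels, and kernels."""
    counts = {}
    for k in species:
        counts[k] = counts.get(k, 0) + 1
    v = []
    for i, (xi, ki) in enumerate(zip(x, species)):
        total = 0.0
        for j, (xj, kj) in enumerate(zip(x, species)):
            if i == j:
                continue
            r = abs(xj - xi)
            # The kernel is indexed by the ordered pair (species of i, species of j).
            total += kernels[(ki, kj)](r) * (xj - xi) / counts[kj]
        v.append(total)
    return v

# Hypothetical asymmetric kernels: negative values repel, positive values attract.
KERNELS = {
    ("prey", "prey"): lambda r: 0.0,
    ("prey", "pred"): lambda r: -1.0 / (1.0 + r),  # prey is repelled by the predator
    ("pred", "prey"): lambda r: 2.0 / (1.0 + r),   # predator is attracted to the prey
    ("pred", "pred"): lambda r: 0.0,
}

x = [0.0, 1.0]
species = ["prey", "pred"]
v = velocities(x, species, KERNELS)
print(v)
```

With the predator to the right of the prey, both velocities come out negative: the prey moves away and the predator follows. Learning would run in the opposite direction, inferring the four kernel functions from observed trajectories.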
A Sparse Bayesian Learning Algorithm for Estimation of Interaction Kernels in Motsch-Tadmor Model
In this paper, we investigate the data-driven identification of asymmetric interaction kernels in the Motsch-Tadmor model based on observed trajectory data. The model under consideration is governed by a class of semilinear evolution equations, where the interaction kernel defines a normalized, state-dependent Laplacian operator that governs collective dynamics. To address the resulting nonlinear inverse problem, we propose a variational framework that reformulates kernel identification using the implicit form of the governing equations, reducing it to a subspace identification problem. We establish an identifiability result that characterizes conditions under which the interaction kernel can be uniquely recovered up to scale. To solve the inverse problem robustly, we develop a sparse Bayesian learning algorithm that incorporates informative priors for regularization, quantifies uncertainty, and enables principled model selection. Extensive numerical experiments on representative interacting particle systems demonstrate the accuracy, robustness, and interpretability of the proposed framework across a range of noise levels and data regimes.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > United States > California > Santa Barbara County > Isla Vista (0.04)
- North America > United States > California > San Diego County > Vista (0.04)
- Asia > China > Guangdong Province (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
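The phrase "recovered up to scale" in the Motsch-Tadmor entry above can be made concrete with a small sketch. In the Motsch-Tadmor model each agent's alignment force is divided by the total influence it receives, so multiplying the kernel by a positive constant leaves the dynamics unchanged. The one-dimensional setting, the function name `mt_acceleration`, the example kernel, and the choice to include the self term in the normalizing sum are assumptions made for illustration.

```python
# Normalized alignment in 1D (an illustrative sketch):
#   dv_i/dt = sum_j phi(|x_j - x_i|) * (v_j - v_i) / sum_k phi(|x_k - x_i|)

def mt_acceleration(x, v, phi):
    """Return dv_i/dt for every agent under normalized alignment."""
    n = len(x)
    acc = []
    for i in range(n):
        weights = [phi(abs(x[j] - x[i])) for j in range(n)]
        numerator = sum(w * (v[j] - v[i]) for j, w in enumerate(weights))
        acc.append(numerator / sum(weights))
    return acc

def phi(r):
    # Hypothetical positive, decaying kernel.
    return 1.0 / (1.0 + r * r)

def phi_scaled(r):
    # Same kernel multiplied by a constant.
    return 3.0 * phi(r)

x = [0.0, 1.0, 3.0]
v = [1.0, 0.0, -1.0]
a1 = mt_acceleration(x, v, phi)
a2 = mt_acceleration(x, v, phi_scaled)
print(a1)
print(a2)
```

The two printed lists agree up to rounding, which is why trajectory data alone cannot fix the overall scale of the kernel in a normalized model of this type.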
Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation
Wu, Di, Gu, Jia-Chen, Yin, Fan, Peng, Nanyun, Chang, Kai-Wei
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks. However, there are significant trustworthiness concerns as RALMs are prone to generating unfaithful outputs, including baseless information or contradictions with the retrieved context. This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics including sequence likelihood, uncertainty quantification, context influence, and semantic alignment to synchronously detect unfaithful sentences. By integrating efficiently measurable and complementary signals, SynCheck enables accurate and immediate feedback and intervention, achieving 0.85 AUROC in detecting faithfulness errors across six long-form retrieval-augmented generation tasks, improving on the prior best method by 4%. Leveraging SynCheck, we further introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation. Empirical results demonstrate that FOD significantly outperforms traditional strategies such as abstention, reranking, or contrastive decoding in terms of faithfulness, achieving over 10% improvement across six datasets.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom (0.14)
- Asia > Russia (0.14)
- (13 more...)
- Personal (1.00)
- Research Report > New Finding (0.66)
- Media > Music (1.00)
- Health & Medicine (1.00)
- Leisure & Entertainment > Sports (0.93)
- (2 more...)
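For readers unfamiliar with the AUROC figure quoted in the SynCheck entry above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example (here, an unfaithful sentence) receives a higher detector score than a randomly chosen negative one, with ties counted as one half. The function name `auroc` and the scores and labels are made up for illustration and have no connection to the paper's data.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

# Hypothetical detector scores; label 1 marks an unfaithful sentence.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
value = auroc(scores, labels)
print(round(value, 4))
```

Here five of the six positive-negative pairs are ranked correctly, so the value is 5/6. A score of 0.5 corresponds to random ranking and 1.0 to perfect separation.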
An Optimal Transport Approach for Computing Adversarial Training Lower Bounds in Multiclass Classification
Trillos, Nicolas Garcia, Jacobs, Matt, Kim, Jakwang, Werenski, Matthew
Despite the success of deep learning-based algorithms, it is widely known that neural networks may fail to be robust. A popular paradigm to enforce robustness is adversarial training (AT); however, this introduces many computational and theoretical difficulties. Recent works have developed a connection between AT in the multiclass classification setting and multimarginal optimal transport (MOT), unlocking a new set of tools to study this problem. In this paper, we leverage the MOT connection to propose computationally tractable numerical algorithms for computing universal lower bounds on the optimal adversarial risk and identifying optimal classifiers. We propose two main algorithms based on linear programming (LP) and entropic regularization (Sinkhorn). Our key insight is that one can harmlessly truncate the higher order interactions between classes, preventing the combinatorial run times typically encountered in MOT problems. We validate these results with experiments on MNIST and CIFAR-10, which demonstrate the tractability of our approach.
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Massachusetts > Middlesex County > Medford (0.04)
- North America > United States > California > Santa Barbara County > Isla Vista (0.04)
- (3 more...)
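The entropic-regularization approach named in the optimal transport entry above builds on Sinkhorn iterations. The sketch below shows the standard two-marginal version, which alternately rescales the rows and columns of a Gibbs kernel until the transport plan matches both marginals. It is not the multimarginal, truncated algorithm of the paper; the function name `sinkhorn`, the regularization strength `eps`, and the toy marginals and cost are assumptions for illustration.

```python
import math

def sinkhorn(a, b, cost, eps=0.5, iters=500):
    """Entropic optimal transport between histograms a and b (two marginals only)."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P_ij = u_i * K_ij * v_j.
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
print(row_sums, col_sums)
```

The row and column sums of the returned plan reproduce `a` and `b` up to numerical tolerance. Smaller values of `eps` bring the plan closer to the unregularized optimum but typically need more iterations.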
A Lexicon for Studying Radicalization in Incel Communities
Klein, Emily, Golbeck, Jennifer
Incels are an extremist online community of men who believe in an ideology rooted in misogyny, racism, the glorification of violence, and dehumanization. In their online forums, they use an extensive, evolving cryptolect - a set of ingroup terms that have meaning within the group, reflect the ideology, demonstrate membership in the community, and are difficult for outsiders to understand. This paper presents a lexicon with terms and definitions for common incel root words, prefixes, and affixes. The lexicon is text-based for use in automated analysis and is derived via a Qualitative Content Analysis of the most frequent incel words, their structure, and their meaning on five of the most active incel communities from 2016 to 2023.
- North America > United States > Vermont > Chittenden County > Burlington (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Texas > El Paso County > El Paso (0.04)
- (6 more...)
- Law Enforcement & Public Safety > Terrorism (0.94)
- Law (0.90)
- Health & Medicine > Therapeutic Area (0.69)
- Education (0.68)
Data-Driven Model Selections of Second-Order Particle Dynamics via Integrating Gaussian Processes with Low-Dimensional Interacting Structures
Feng, Jinchao, Kulick, Charles, Tang, Sui
In this paper, we focus on the data-driven discovery of a general second-order particle-based model that contains many state-of-the-art models for modeling the aggregation and collective behavior of interacting agents of similar size and body type. This model takes the form of a high-dimensional system of ordinary differential equations parameterized by two interaction kernels that appraise the alignment of positions and velocities. We propose a Gaussian Process-based approach to this problem, where the unknown model parameters are marginalized by using two independent Gaussian Process (GP) priors on latent interaction kernels constrained to dynamics and observational data. This results in a nonparametric model for interacting dynamical systems that accounts for uncertainty quantification. We also develop acceleration techniques to improve scalability. Moreover, we perform a theoretical analysis to interpret the methodology and investigate the conditions under which the kernels can be recovered. We demonstrate the effectiveness of the proposed approach on various prototype systems, including the selection of the order of the systems and the types of interactions. In particular, we present applications to modeling two real-world fish motion datasets that display flocking and milling patterns up to 248 dimensions. Despite the use of small data sets, the GP-based approach learns an effective representation of the nonlinear dynamics in these spaces and outperforms competitor methods.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > United States > California > Santa Barbara County > Isla Vista (0.04)
- North America > United States > California > San Diego County > Vista (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
- (2 more...)
Identity Construction in a Misogynist Incels Forum
Yoder, Michael Miller, Perry, Chloe, Brown, David West, Carley, Kathleen M., Pruden, Meredith L.
Online communities of involuntary celibates (incels) are a prominent source of misogynist hate speech. In this paper, we use quantitative text and network analysis approaches to examine how identity groups are discussed on incels-dot-is, the largest black-pilled incels forum. We find that this community produces a wide range of novel identity terms and, while terms for women are most common, mentions of other minoritized identities are increasing. An analysis of the associations made with identity groups suggests an essentialist ideology where physical appearance, as well as gender and racial hierarchies, determine human value. We discuss implications for research into automated misogynist hate speech detection.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.28)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (12 more...)
- Law Enforcement & Public Safety > Terrorism (0.69)
- Health & Medicine > Therapeutic Area (0.68)
- Information Technology (0.68)
- Law > Civil Rights & Constitutional Law (0.46)
Learning Transition Operators From Sparse Space-Time Samples
Kümmerle, Christian, Maggioni, Mauro, Tang, Sui
We consider the nonlinear inverse problem of learning a transition operator $\mathbf{A}$ from partial observations at different times, in particular from sparse observations of entries of its powers $\mathbf{A},\mathbf{A}^2,\ldots,\mathbf{A}^{T}$. This Spatio-Temporal Transition Operator Recovery problem is motivated by the recent interest in learning time-varying graph signals that are driven by graph operators depending on the underlying graph topology. We address the nonlinearity of the problem by embedding it into a higher-dimensional space of suitable block-Hankel matrices, where it becomes a low-rank matrix completion problem, even if $\mathbf{A}$ is of full rank. For both a uniform and an adaptive random space-time sampling model, we quantify the recoverability of the transition operator via suitable measures of incoherence of these block-Hankel embedding matrices. For graph transition operators these measures of incoherence depend on the interplay between the dynamics and the graph topology. We develop a suitable non-convex iterative reweighted least squares (IRLS) algorithm, establish its quadratic local convergence, and show that, in optimal scenarios, no more than $\mathcal{O}(rn \log(nT))$ space-time samples are sufficient to ensure accurate recovery of a rank-$r$ operator $\mathbf{A}$ of size $n \times n$. This establishes that spatial samples can be substituted by a comparable number of space-time samples. We provide an efficient implementation of the proposed IRLS algorithm with space complexity of order $\mathcal{O}(rnT)$ and per-iteration time complexity linear in $n$. Numerical experiments for transition operators based on several graph models confirm that the theoretical findings accurately track empirical phase transitions, and illustrate the applicability and scalability of the proposed algorithm.
- North America > United States > Minnesota (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (9 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
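The claim in the transition-operator entry above, that a block-Hankel embedding is low rank even when the operator itself has full rank, can be checked on a toy case. The sketch below stacks powers of a 2 x 2 full-rank matrix into a 6 x 6 block-Hankel matrix whose block (p, q) is A^(p+q+1) and computes ranks exactly with rational arithmetic. This is one natural construction and the paper's exact embedding may differ; the helper names `matmul`, `block_hankel`, and `rank` and the example matrix are assumptions.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def block_hankel(A, T1, T2):
    """Block-Hankel matrix whose (p, q) block is A^(p+q+1), p < T1, q < T2."""
    n = len(A)
    powers = [A]
    for _ in range(T1 + T2 - 2):
        powers.append(matmul(powers[-1], A))
    H = []
    for p in range(T1):
        for i in range(n):
            row = []
            for q in range(T2):
                row.extend(powers[p + q][i])
            H.append(row)
    return H

def rank(M):
    """Exact rank by Gaussian elimination (entries should be Fractions)."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

# Hypothetical full-rank 2 x 2 transition matrix.
A = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 4), Fraction(3, 4)]]
H = block_hankel(A, 3, 3)
print(len(H), len(H[0]), rank(A), rank(H))
```

The 6 x 6 matrix has the same rank as A because it factors as the stacked column of powers [A; A^2; A^3] times the row [I, A, A^2]. The rank of the embedding is therefore bounded by n regardless of how many time steps are stacked.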
Learning Theory for Inferring Interaction Kernels in Second-Order Interacting Agent Systems
Miller, Jason, Tang, Sui, Zhong, Ming, Maggioni, Mauro
Modeling the complex interactions of systems of particles or agents is a fundamental scientific and mathematical problem that is studied in diverse fields, ranging from physics and biology to economics and machine learning. In this work, we describe a very general second-order, heterogeneous, multivariable, interacting agent model, with an environment, that encompasses a wide variety of known systems. We describe an inference framework that uses nonparametric regression and approximation-theory-based techniques to efficiently derive estimators of the interaction kernels which drive these dynamical systems. We develop a complete learning theory which establishes strong consistency and optimal nonparametric minimax rates of convergence for the estimators, as well as provably accurate predicted trajectories. The estimators exploit the structure of the equations in order to overcome the curse of dimensionality, and we describe a fundamental coercivity condition on the inverse problem which ensures that the kernels can be learned and relates to the minimal singular value of the learning matrix. The numerical algorithm presented to build the estimators is parallelizable, performs well on high-dimensional problems, and is demonstrated on complex dynamical systems.
- North America > United States > New York (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- (3 more...)